print("Code Chunk")
[1] "Code Chunk"
January 9, 2025
You are expected to have R and RStudio installed; if not, see the installing R section.
In the discussion section, we will focus on coding and practicing what we have learned in the lectures.
Office hours are on Tuesdays, 11:00-12:30, in Scott 110.
Questions?
To insert a Code Chunk, you can use Ctrl+Alt+I on Windows and Cmd+Option+I on Mac. Run the whole chunk by clicking the green triangle, or one/multiple lines by using Ctrl + Enter or Command + Return on Mac.
Most of the functions we want to run require an argument. For example, the function print() above takes the argument “Code Chunk”.
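As a small illustration of how arguments work, the sketch below calls a base R function in two ways. The number is made up for the example.

```r
# round() takes the number to round and, optionally, how many digits to keep.
rounded <- round(3.14159, digits = 2)
print(rounded)

# Arguments can be matched by name or by position; these two calls are equivalent.
same_by_position <- round(3.14159, 2)
identical(rounded, same_by_position)
```

Naming arguments is optional, but it usually makes the code easier to read.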
There are many data structures, but the most important ones to know are the following.
Vectors, which are created with c().
Data frames, whose columns are accessed with the $ operator.
We work with various classes of data, and the analysis we perform depends heavily on these classes.
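A minimal sketch of these two structures and of checking classes follows. The object names, country labels, and values are invented for illustration and are not part of the course data.

```r
# A vector holds values of a single class; c() combines values into one vector.
scores <- c(0.25, 0.50, 0.75)
regimes <- c("autocracy", "hybrid", "democracy")

# A data frame is a table whose columns are vectors of equal length.
toy <- data.frame(
  country = c("A", "B", "C"),
  regime = regimes,
  score = scores,
  stringsAsFactors = FALSE  # keep text columns as text in every R version
)

# The $ operator pulls one column out of the data frame as a vector.
toy$score

# class() reports how R stores each column.
class(toy$score)   # numeric
class(toy$regime)  # character, even though it is really a category
```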
As you noticed, R did not identify the class of the data correctly. We can change it using the as.factor() function. You can change the class of a variable in the same way with as.numeric(), as.integer(), or as.character().
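The conversion functions can be sketched as follows. The vectors are hypothetical examples, not the data used in the section.

```r
# Hypothetical survey answers stored as text.
answers <- c("agree", "disagree", "agree", "neutral")
class(answers)

# as.factor() turns the text into a categorical variable with levels.
answers_factor <- as.factor(answers)
class(answers_factor)
levels(answers_factor)

# Numbers that were read in as text can be converted with as.numeric().
years_text <- c("1990", "2000", "2010")
years <- as.numeric(years_text)
class(years)
```

One thing to watch for: calling as.numeric() directly on a factor returns the internal level codes, not the original labels, so convert with as.character() first if the labels are numbers.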
Quite frequently, we will use additional libraries to extend the capabilities of R. I’m sure you remember tidyverse. Let’s load it.
If you recently updated or downloaded R, you can install libraries using the function install.packages().
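A minimal setup sketch, assuming an internet connection for the installation step:

```r
# Run once in the console; installation is only needed once per R installation.
# install.packages("tidyverse")

# Run in every new R session before using tidyverse functions.
library(tidyverse)
```

Installing puts the package on your computer, while library() makes it available in the current session.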
Pipes (%>% or |>) help streamline code. They make the code read linearly, step by step. In plain English, a pipe translates to “take an object, and then”.
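To make the “and then” reading concrete, here is a small sketch comparing nested calls with a pipe. It uses the native pipe, which assumes R 4.1 or later; the numbers are made up.

```r
values <- c(1, 4, 9)

# Nested calls are read from the inside out.
nested <- sum(sqrt(values))

# The pipe reads left to right:
# take values, and then take square roots, and then add them up.
piped <- values |> sqrt() |> sum()

identical(nested, piped)
```

The %>% pipe can be used the same way once tidyverse is loaded.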
First task: install vdemdata from your console. Then load the library.
This is the V-Dem dataset. For your reference, their codebook is available here.
The dataset is huge! Be careful.
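One cautious way to look at a large data frame is to check its size and a few columns instead of printing all of it. The sketch below assumes that vdemdata is already installed and that it provides the data as an object named vdem. The installation command in the comment assumes the package is distributed through GitHub, so check the package's own instructions if it fails.

```r
# Assumption: vdemdata is installed from GitHub, for example with
# devtools::install_github("vdeminstitute/vdemdata")
library(vdemdata)

# Assumption: the package provides the dataset as `vdem`.
dim(vdem)          # number of rows and columns
names(vdem)[1:10]  # first ten variable names
head(vdem[, 1:5])  # first rows of the first five columns only
```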
Imagine you are interested in the relationship between regime type and physical violence. Let’s select the variables we will work with. Unfortunately, the variable names are not very straightforward. The regime index is e_v2x_polyarchy_5C and the physical violence index is v2x_clphy.
Let’s rename the variables so they are easier to work with.
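Selecting and renaming can be sketched on a toy stand-in for the V-Dem data. The data values, the extra columns, and the new names regime and violence are hypothetical choices for illustration.

```r
library(dplyr)

# Toy stand-in for the V-Dem data; all values are invented.
toy_vdem <- tibble(
  country_name = c("A", "B", "C"),
  year = c(2000, 2000, 2000),
  e_v2x_polyarchy_5C = c(0, 0.5, 1),
  v2x_clphy = c(0.2, 0.6, 0.9),
  other_variable = c(1, 2, 3)
)

# Keep only the columns of interest, and then give them readable names.
# rename() uses the pattern new_name = old_name.
toy_small <- toy_vdem %>%
  select(country_name, year, e_v2x_polyarchy_5C, v2x_clphy) %>%
  rename(regime = e_v2x_polyarchy_5C, violence = v2x_clphy)

names(toy_small)
```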
Now, analyze the regime data. We can describe it using various statistics. Let’s check the minimum score for the regime variable.
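Missing values matter for these statistics. The sketch below uses an invented vector, not the V-Dem data, to show how min() behaves when a value is missing.

```r
# Invented scores with one missing value.
regime_scores <- c(0, 0.25, NA, 0.75, 1)

# With a missing value present, min() returns NA.
min(regime_scores)

# na.rm = TRUE drops missing values before computing the statistic.
min(regime_scores, na.rm = TRUE)
```

The na.rm argument is available in the other summary functions as well.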
Check the max score for the regime variable below.
Check the average score for the regime variable below.
Finally, use the summary() function.
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0.0000 0.0000 0.0000 0.2224 0.2500 1.0000 1139
mutate(dem = ifelse(e_v2x_polyarchy_5C >= 0.5, 1, 0))
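For recoding, ifelse() covers two outcomes, while case_when() is meant for several conditions written as condition ~ value. A minimal sketch on invented data follows; the labels and the 0.75 cutoff are hypothetical.

```r
library(dplyr)

# Invented scores on the five-category scale, including a missing value.
toy <- tibble(e_v2x_polyarchy_5C = c(0, 0.25, 0.5, 0.75, 1, NA))

toy <- toy %>%
  mutate(
    # Two outcomes: ifelse(condition, value if TRUE, value if FALSE).
    dem = ifelse(e_v2x_polyarchy_5C >= 0.5, 1, 0),
    # Several outcomes: case_when() checks the conditions in order
    # and uses the first one that is TRUE.
    level = case_when(
      e_v2x_polyarchy_5C >= 0.75 ~ "high",
      e_v2x_polyarchy_5C >= 0.5 ~ "medium",
      e_v2x_polyarchy_5C < 0.5 ~ "low"
    )
  )

toy
```

Rows with a missing score stay missing in both new variables, because no condition is TRUE for them.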
| Statistic | Function | Example Usage |
|---|---|---|
| Minimum | min() | min(x) |
| Maximum | max() | max(x) |
| Mean | mean() | mean(x) |
| Median | median() | median(x) |
| Standard Deviation | sd() | sd(x) |
| Variance | var() | var(x) |
| Sum | sum() | sum(x) |
| Summary | summary() | summary(x) |
Base R vs Tidyverse
Useful functions, sample()
Visualizations
Tidyverse basics (mutate, filter, select, summarize, etc.)
Descriptive statistics
Confidence intervals
| Function | Description |
|---|---|
| select() | Selects specific columns from a data frame |
| mutate() | Adds new variables or modifies existing ones |
| filter() | Filters rows based on specified conditions |
| group_by() | Groups data by one or more variables for subsequent operations |
| summarize() | Summarizes data by applying a function (e.g., mean, sum) |
| case_when() | Modifies a variable based on conditional logic |
| rename() | Renames columns in a data frame |
You can check how to use these commands in this script, or you can open the help page for any function by typing ? followed by its name, for example ?mutate.
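As a rough sketch of how a few of these functions combine, the example below filters rows and computes a grouped average. The data frame, region names, and values are invented for illustration.

```r
library(dplyr)

# Invented data: two hypothetical regions observed in two years.
toy <- tibble(
  region = c("North", "North", "South", "South"),
  year = c(1990, 2000, 1990, 2000),
  score = c(0.2, 0.4, 0.6, NA)
)

# Keep only the rows from 2000 onward.
recent <- toy %>% filter(year >= 2000)

# Average score per region, ignoring missing values.
by_region <- toy %>%
  group_by(region) %>%
  summarize(mean_score = mean(score, na.rm = TRUE))

by_region
```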
First, we need to install R. Click the button below and click “Download and Install R”. Choose your OS. For Windows, you need to download “base”; for macOS and Linux, choose the version that matches your OS. Then install it.
For Windows:
Second, we need to install RStudio. Click the button below and click “Download RStudio Desktop”. You will be redirected to your version automatically. Install.